Blaenau Gwent
Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models
Zhu, Xiaochen, Karadzhov, Georgi, Whitehouse, Chenxi, Vlachos, Andreas
Diffusion models have shown promise in text generation but often struggle to generate long, coherent, and contextually accurate text. Token-level diffusion overlooks word-order dependencies and enforces short output windows, while passage-level diffusion struggles to learn robust representations for long-form text. To address these challenges, we propose Segment-Level Diffusion (SLD), a framework that enhances diffusion-based text generation through text segmentation, robust representation training with adversarial and contrastive learning, and improved latent-space guidance. By segmenting long-form outputs into separate latent representations and decoding them with an autoregressive decoder, SLD simplifies diffusion predictions and improves scalability. Experiments on XSum, ROCStories, DialogSum, and DeliData demonstrate that SLD achieves competitive or superior performance in fluency, coherence, and contextual compatibility across automatic and human evaluation metrics compared with other diffusion and autoregressive baselines. Ablation studies further validate the effectiveness of our segmentation and representation learning strategies.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > United Kingdom > Wales > Blaenau Gwent (0.04)
- Africa > Rwanda > Kigali > Kigali (0.04)
- Research Report (0.82)
- Overview (0.68)
- Personal > Interview (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
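The segmentation step described in the SLD abstract above can be illustrated with a minimal sketch. It assumes sentence-level segments split on terminal punctuation; the function name `segment_text` and the regex rule are hypothetical choices for illustration and are not taken from the paper. In the framework as described, each segment would then receive its own latent representation, which is not shown here.

```python
import re
from typing import List


def segment_text(text: str) -> List[str]:
    """Split long-form text into sentence-like segments.

    Assumption of this sketch: a segment ends at '.', '!' or '?'
    followed by whitespace. The paper may segment differently.
    """
    # Split on whitespace that follows terminal punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings that can appear for empty input.
    return [p for p in parts if p]


if __name__ == "__main__":
    story = "Tom lost his keys. He searched the house! Were they in the car?"
    for i, segment in enumerate(segment_text(story)):
        print(i, segment)
```

Keeping segments short is what lets each diffusion prediction target one small unit rather than a whole passage, which is the simplification the abstract refers to.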
A Knowledge-Injected Curriculum Pretraining Framework for Question Answering
Lin, Xin, Su, Tianhuang, Huang, Zhenya, Xue, Shangzi, Liu, Haifeng, Chen, Enhong
Knowledge-based question answering (KBQA) is a key task in NLP research and a way to access web data and knowledge, which requires exploiting knowledge graphs (KGs) for reasoning. In the literature, one promising solution for KBQA is to combine a pretrained language model (LM) with KGs by generating a KG-centered pretraining corpus, an approach that has shown its superiority. However, these methods often depend on specific techniques and resources that may not always be available, which restricts their application. Moreover, existing methods focus more on improving language understanding with KGs, while neglecting the more important human-like complex reasoning. To this end, in this paper, we propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for KBQA tasks, which is composed of knowledge injection (KI), knowledge adaptation (KA) and curriculum reasoning (CR). Specifically, the KI module first injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps that can work with different implementations for flexible application. Next, the KA module learns knowledge from the generated corpus using an LM equipped with an adapter, while preserving the LM's original natural language understanding ability, to reduce the negative impact of the difference between the generated and natural corpora. Last, to equip the LM with complex reasoning, the CR module follows human reasoning patterns to construct three corpora of increasing reasoning difficulty, and further trains the LM from easy to hard in a curriculum manner. We provide an implementation of the general framework, and evaluate the proposed KICP on four real-world datasets. The results demonstrate that our framework achieves higher performance.
- Asia > Singapore > Central Region > Singapore (0.05)
- Asia > China > Anhui Province > Hefei (0.04)
- Europe > Greece (0.04)
- Research Report > New Finding (0.34)
- Research Report > Promising Solution (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.62)
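The easy-to-hard training order described in the KICP abstract above can be sketched as a simple loop over difficulty-ranked corpora. This is a minimal illustration, not the authors' implementation: `run_curriculum`, the integer difficulty labels, and the `train_step` callback are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple


def run_curriculum(
    stages: List[Tuple[int, List[str]]],
    train_step: Callable[[str], None],
) -> List[int]:
    """Train on each corpus in order of increasing difficulty.

    `stages` pairs a difficulty level (assumed: lower is easier) with
    a corpus of training examples. Returns the order of levels visited.
    """
    visited = []
    for difficulty, corpus in sorted(stages, key=lambda stage: stage[0]):
        for example in corpus:
            # Placeholder for one optimization step on the LM.
            train_step(example)
        visited.append(difficulty)
    return visited


if __name__ == "__main__":
    seen: List[str] = []
    stages = [
        (3, ["multi-hop question"]),
        (1, ["single-fact question"]),
        (2, ["two-fact question"]),
    ]
    print(run_curriculum(stages, seen.append))
    print(seen)
```

Finishing each stage before moving to the next is the simplest curriculum schedule; mixing in earlier stages is a common variant but is not implied by the abstract.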
Farhan Mirza jailed for blackmailing women with photos
A "sexual predator" has been jailed for eight and a half years for blackmailing and spying on Muslim women using intimate photographs and videos he took of them without their knowledge. Farhan Mirza, 38, of Abertillery, Blaenau Gwent, secretly filmed the women and threatened to share the footage before demanding money. Mirza, who denied the charges, met some of the women on online dating sites. He was jailed for voyeurism, blackmail, theft and fraud at Cardiff Crown Court. During the trial, jurors heard Mirza had initially impressed his victims by claiming to be a doctor and hung surgical scrubs in his wardrobe and carried a stethoscope in his car. He also claimed his family were highly educated professionals working in locations around the world.